A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization
نویسندگان
چکیده
Linear optimization is many times algorithmically simpler than non-linear convex optimization. Linear optimization over matroid polytopes, matching polytopes and path polytopes are example of problems for which we have simple and efficient combinatorial algorithms, but whose non-linear convex counterpart is harder and admit significantly less efficient algorithms. This motivates the computational model of convex optimization, including the offline, online and stochastic settings, using a linear optimization oracle. In this computational model we give several new results that improve over the previous state of the art. Our main result is a novel conditional gradient algorithm for smooth and strongly convex optimization over polyhedral sets that performs only a single linear optimization step over the domain on each iteration and enjoys a linear convergence rate. This gives an exponential improvement in convergence rate over previous results. Based on this new conditional gradient algorithm we give the first algorithms for online convex optimization over polyhedral sets that perform only a single linear optimization step over the domain while having optimal regret guarantees, answering an open question of [20, 16]. Our online algorithms also imply conditional gradient algorithms for non-smooth and stochastic convex optimization with the same convergence rates as projected (sub)gradient methods.
منابع مشابه
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of ...
متن کاملStochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure
Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. However, in the context of empirical risk minimization, it is often helpful to augment the training set by considering random perturbations of input examples. In this case, the objective is no longer a finite sum, and the main candidate for optimization is the stochas...
متن کاملA Linearly-Convergent Stochastic L-BFGS Algorithm
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs ...
متن کاملA Stochastic Quasi-Newton Method for Online Convex Optimization
We develop stochastic variants of the wellknown BFGS quasi-Newton optimization method, in both full and memory-limited (LBFGS) forms, for online optimization of convex functions. The resulting algorithm performs comparably to a well-tuned natural gradient descent but is scalable to very high-dimensional problems. On standard benchmarks in natural language processing, it asymptotically outperfor...
متن کاملHAMSI: A Parallel Incremental Optimization Algorithm Using Quadratic Approximations for Solving Partially Separable Problems
We propose HAMSI, a provably convergent incremental algorithm for solving large-scale partially separable optimization problems that frequently emerge in machine learning and inferential statistics. The algorithm is based on a local quadratic approximation and hence allows incorporating a second order curvature information to speed-up the convergence. Furthermore, HAMSI needs almost no tuning, ...
متن کامل